- BASEBALL.INF
- August 1, 1995
-
- JUST HOW DOES THIS THING WORK?
-
- A few people have asked how SBS goes about simulating a game, so I
- decided to write down a few notes for those interested.
-
-
- HITTING vs PITCHING - the first step:
-
- A hitter can do one of six things in SBS. He can:
- 1) Make an out.
- 2) Get a walk.
- 3) Get a single.
- 4) Get a double.
- 5) Get a triple.
- 6) Get a Home Run.
- (He could also get on base on an error but that is really a subset of
- making an out as far as the stats are concerned). SBS calculates from
- the hitter's record the probability of each occurrence. Then the
- pitcher's record is considered which modifies the probabilities. This
- will be demonstrated by working through an example. Consider the batter
- named Joe Hitter:
-
-                AB  Hits  2B  3B  HR  BB   K   AVG
-   Joe Hitter  482   147  20   4  12  41  71  .305
-
- Compute PA (Plate Appearances) = AB + BB = 523
- (ignore sacrifice hits and Hit-By-Pitched-Balls which would make PA a
- little larger)
- Compute HBB (probability of walk)   = BB / PA                      = .078
- Compute H1  (probability of single) = (Hits - (2B + 3B + HR)) / PA = .212
- Compute H2  (probability of double) = 2B / PA                      = .038
- Compute H3  (probability of triple) = 3B / PA                      = .008
- Compute H4  (probability of HR)     = HR / PA                      = .023
-                                                                     -------
-                                                                       .359
-
- What's left is the probability of making an out (or possibly reaching
- on an error): 1.000 - .359 = .641
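-
- The hitter calculation above can be sketched in a few lines of Python.
- This is only an illustration of the arithmetic, not SBS's actual code;
- the function and variable names are made up for the example.
-
```python
def hitter_probabilities(ab, hits, doubles, triples, hr, bb):
    """Per-plate-appearance event probabilities from a season batting line."""
    # PA ignores sacrifice hits and hit-by-pitches, as in the text.
    pa = ab + bb
    singles = hits - (doubles + triples + hr)
    probs = {
        "walk": bb / pa,
        "single": singles / pa,
        "double": doubles / pa,
        "triple": triples / pa,
        "home run": hr / pa,
    }
    # Whatever is left over is an out (or reaching on an error).
    probs["out"] = 1.0 - sum(probs.values())
    return probs


joe = hitter_probabilities(ab=482, hits=147, doubles=20, triples=4, hr=12, bb=41)
for event, p in joe.items():
    print(f"{event:9s} {p:.3f}")
```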
-
- We have determined Joe's probabilities as a whole against all the
- pitchers he faced that particular season. But now we have to calculate
- probabilities given a particular pitcher. Consider a player named Jack
- Pitcher:
-
-                  IP  Hits  HR  BB   K
-   Jack Pitcher  200   210  15  50  80
-
- If a pitcher threw 200 innings we know he got approximately 600 batters
- out. The ones he did not get out got hits or walks or reached on errors.
-
- Compute BF (batters faced) = (IP x 3) + Hits + BB
-
- Sometimes a pitcher gets a few extra outs such as double plays and
- runners getting thrown out on the bases. On the other hand sometimes he
- faces extra hitters because his defense makes errors. These two factors
- just about cancel each other out so we can leave the BF equation alone
- for purposes of explanation.
-
- We now need to compute the probabilities for the pitcher for the same
- events that we calculated for the batter.
-
- Compute BF (batters faced) = (IP x 3) + Hits + BB = 860
-
- Compute PBB (probability of walk)   = BB / BF          = .0581
- Compute P1  (probability of single) = Hits x .717 / BF = .1751
- Compute P2  (probability of double) = Hits x .174 / BF = .0425
- Compute P3  (probability of triple) = Hits x .024 / BF = .0059
- Compute P4  (probability of HR)     = HR / BF          = .0174
-
- The statistics do not usually tell us how many singles, doubles and
- triples a pitcher allowed. Usually just the total hits and home runs
- are given. However, we can use the multipliers .717, .174 and .024 to
- estimate singles, doubles and triples from the total number of hits.
- [If all "HR's allowed" are zero in the .DAT file, SBS uses a "league
- average" for P4 found from data in the .CFG file. This essentially
- removes any influence a pitcher has over this statistic, however].
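-
- The pitcher side can be sketched the same way. Again this is just the
- arithmetic from the example, with invented names; the league-average
- fallback for P4 is left out because its .CFG details are not given here.
-
```python
# Share of total hits allowed that are estimated to be singles, doubles
# and triples (the multipliers quoted in the text).
SINGLE_SHARE, DOUBLE_SHARE, TRIPLE_SHARE = 0.717, 0.174, 0.024


def pitcher_probabilities(ip, hits, hr, bb):
    """Per-batter event probabilities from a season pitching line."""
    # Three outs per inning, plus everyone who reached on a hit or a walk.
    bf = ip * 3 + hits + bb
    return {
        "walk": bb / bf,
        "single": hits * SINGLE_SHARE / bf,
        "double": hits * DOUBLE_SHARE / bf,
        "triple": hits * TRIPLE_SHARE / bf,
        "home run": hr / bf,
    }


jack = pitcher_probabilities(ip=200, hits=210, hr=15, bb=50)
for event, p in jack.items():
    print(f"{event:9s} {p:.4f}")
```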
-
- Comparing the percentages we obtained from the hitter with the pitcher
- we have come up with the following:
-
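- Event       Hitter   Pitcher
- ------      ------   -------
- walk         7.72%     5.81%
- single      20.90%    17.51%
- double       3.77%     4.25%
- triple       0.75%     0.59%
- home run     2.26%     1.74%
- out         64.60%    70.10%
-
- [The hitter column is slightly lower than the rounded figures computed
- earlier because it corresponds to a PA of 531 rather than 523, as would
- happen if the sacrifice hits and hit-by-pitches ignored above were
- counted. The figures below use this column.]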
-
- At first glance it might seem that all we need to do now is average the
- hitter's and pitcher's events like this:
-
- single = (20.9% + 17.51%) / 2 = 19.21%
- etc. for the rest of the events
-
- This method is unacceptable, however, because it penalizes the outstanding
- players and rewards the poor players. That is, it tends to lump everyone
- together too much. In our example above we have a good hitter (.305 vs
- the league) against a quite average pitcher. Certainly we could not
- expect his single% to DROP to 19.21% from 20.90%! After all if this is
- an average pitcher we would expect our batter to do at least as well
- against him as he did against the rest of the league!
-
- The solution is to use "league averages" for the events and to compare
- our pitcher's performance against the league averages. For example, we
- can pick up a baseball statistics magazine containing the statistics
- from the preceding year, and calculate the total "batters faced" for
- all the pitchers for the entire season. We can also total the number of
- walks, hits, and home runs -- and estimate using our multipliers above
- -- the total number of singles, doubles, and triples. Then we can
- calculate our "league averages".
-
- For example, we find that for an entire season there were 17,500 innings
- pitched, 16,500 hits, 1,450 home runs and 6,300 walks. Calculate League
- Averages:
-
- League Avg. BF   = 17,500 x 3 + 16,500 + 6,300 = 75,300
-
- League Avg. LABB = 6,300 / 75,300              = .0837
- League Avg. LA1  = .717 x 16,500 / 75,300      = .1571
- League Avg. LA2  = .174 x 16,500 / 75,300      = .0381
- League Avg. LA3  = .024 x 16,500 / 75,300      = .0053
- League Avg. LA4  = 1,450 / 75,300              = .0193
-
- Finally we can combine our hitter percentages with our pitcher
- percentages to get meaningful probabilities:
-
- Combined percentages:
- walk% = HBB * (PBB / LABB) = .0536
- single% = H1 * (P1 / LA1) = .2329
- double% = H2 * (P2 / LA2) = .0421
- triple% = H3 * (P3 / LA3) = .0083
- home run% = H4 * (P4 / LA4) = .0204
-
- Note that if the pitcher's percentages are nearly equal to the League
- Averages, the second factor becomes essentially 1 and the hitter
- performs as expected. But if the pitcher's percentages are substantially
- better (lower) than the League Averages, the second factor will be less
- than 1 and the hitter will suffer. The reverse is true if the pitcher's
- percentages are worse (larger) than the League Averages.
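-
- Here is a small sketch of the combination step, using the hitter,
- pitcher and league figures quoted above as inputs. The names are
- invented for the example; "out" is simply taken as whatever probability
- is left over.
-
```python
# Figures copied from the comparison table and the league averages above.
hitter = {"walk": .0772, "single": .2090, "double": .0377,
          "triple": .0075, "home run": .0226}
pitcher = {"walk": .0581, "single": .1751, "double": .0425,
           "triple": .0059, "home run": .0174}
league = {"walk": .0837, "single": .1571, "double": .0381,
          "triple": .0053, "home run": .0193}


def combine(hitter, pitcher, league):
    """Scale each hitter probability by the pitcher's ratio to league average."""
    combined = {e: hitter[e] * pitcher[e] / league[e] for e in hitter}
    combined["out"] = 1.0 - sum(combined.values())
    return combined


combined = combine(hitter, pitcher, league)
for event, p in combined.items():
    print(f"{event:9s} {p:.4f}")
```
-
- Feeding the league averages in as the "pitcher" leaves the hitter's
- numbers unchanged, which is the property the simple averaging lacked.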
-
-
- RIGHTIES VS LEFTIES - the second step:
-
- The example above is the basis of how a given hitter is expected to
- perform against a given pitcher. But we also can fine-tune our model to
- correct for the baseball maxim that right-handed batters do better
- against left-handed pitchers and vice-versa. [A batter facing a
- like-handed pitcher suffers somewhat]. This is, of course, a very
- individual thing -- not affecting some players while severely affecting
- others. SBS does not know which players are exceptional in this area --
- the data files do not show a breakdown versus right or left handed
- opponents. But we can make some broad assumptions which are useful in
- large simulations.
-
- Approximately two-thirds of all innings pitched are by right-handers.
- Because a typical batter will see so much more right-handed pitching
- than left-handed pitching, his average vs. left-handed pitching will
- show a greater fluctuation.
-
- Consider the following typical scenario for a RIGHT-handed hitter:
-
-                                AB   Hits   Avg.
-                              ___________________
- Total                      |  600    180   .300
-                            |
- vs. Right-Handed Pitching  |  400    116   .290
-                            |
- vs. Left-Handed Pitching   |  200     64   .320
-
- Notice that his boost vs left-handed pitching (20 points) is twice that
- of his penalty vs. right-handed pitching (10 points). This is because he
- sees approximately twice as much right-handed pitching over the course
- of a season.
-
- For the typical LEFT-handed hitter:
-
-                                AB   Hits   Avg.
-                              ___________________
- Total                      |  600    180   .300
-                            |
- vs. Right-Handed Pitching  |  400    124   .310
-                            |
- vs. Left-Handed Pitching   |  200     56   .280
-
- Notice that his penalty vs left-handed pitching (20 points) is twice
- that of his boost vs. right-handed pitching (10 points). Again, this is
- because a full-time player sees twice as much right-handed pitching as
- left-handed pitching.
-
- Actually SBS does not reward or penalize by an absolute number of
- batting "points" but instead increases or decreases each of the hitter's
- probabilities by a percentage. SBS uses 3% and 6% for these adjustments;
- in keeping with the tables above, the smaller figure corresponds to
- right-handed pitching and the larger to left-handed pitching. The
- adjustment is spread out among all the types of hits, not just singles.
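-
- A rough sketch of such an adjustment follows. It assumes the 3% figure
- applies against right-handers and the 6% figure against left-handers
- (matching the 10-point and 20-point pattern in the tables), that the
- factor multiplies each hit probability, and that walks are left alone.
- Switch hitters are ignored. All names are invented for the example.
-
```python
HIT_EVENTS = ("single", "double", "triple", "home run")


def platoon_factor(batter_hand, pitcher_hand):
    """Multiplier for hit probabilities; hands are 'R' or 'L'."""
    # Assumption: smaller adjustment vs right-handed pitching, larger vs left.
    size = 0.03 if pitcher_hand == "R" else 0.06
    # Same-handed matchup hurts the batter, opposite-handed helps him.
    return 1.0 - size if batter_hand == pitcher_hand else 1.0 + size


def apply_platoon(probs, batter_hand, pitcher_hand):
    """Scale only the hit events; other events pass through unchanged."""
    factor = platoon_factor(batter_hand, pitcher_hand)
    return {e: p * factor if e in HIT_EVENTS else p for e, p in probs.items()}


adjusted = apply_platoon({"walk": .0536, "single": .2329}, "R", "L")
print(adjusted)
```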
-
-
- PITCHER FATIGUE FACTOR - the third step:
-
- We have looked at two of the three major factors used in determining the
- percentages of events. The last major factor is the Pitcher Fatigue
- factor. That is, how does the effectiveness of a pitcher vary over the
- course of a game, and when does he tire out? SBS uses a linear function
- for pitcher fatigue which makes a pitcher more effective than normal if
- the pitcher has worked less than about 4 innings and less effective than
- normal if the pitcher has worked more than about 4 innings. However SBS
- treats starting pitchers differently from relievers. SBS assumes
- relievers tire more quickly so the break-off point is about 2 innings
- instead of 4.
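-
- One way to picture a linear fatigue function is sketched below. Only
- the break-off points (about 4 innings for starters, about 2 for
- relievers) come from the description above; the slope of 0.02 per
- inning and the idea of multiplying the batter's on-base probabilities
- by the factor are assumptions made purely for illustration.
-
```python
def fatigue_factor(innings_worked, is_starter, slope=0.02):
    """Linear multiplier: below 1 for a fresh pitcher, above 1 for a tired one.

    The slope is a hypothetical value, not taken from SBS.
    """
    break_point = 4.0 if is_starter else 2.0
    return 1.0 + slope * (innings_worked - break_point)


def apply_fatigue(probs, innings_worked, is_starter):
    """Scale every non-out event by the fatigue factor (an assumption)."""
    factor = fatigue_factor(innings_worked, is_starter)
    return {e: p * factor for e, p in probs.items() if e != "out"}


print(fatigue_factor(1.0, is_starter=True))   # fresh starter, factor below 1
print(fatigue_factor(7.0, is_starter=True))   # tiring starter, factor above 1
print(fatigue_factor(3.0, is_starter=False))  # reliever past his break-off point
```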
-
-
- PUTTING IT ALL TOGETHER:
-
- Let's say we now have arrived at our five percentages and have adjusted
- them the way we want for the righty/lefty effect and the pitcher fatigue
- effect. We now need to "stack" them on top of each other and calculate
- the "break" points. Each percentage is expressed as a fraction between
- 0 and 1.
-
- bk1 = walk%
- bk2 = bk1 + single%
- bk3 = bk2 + double%
- bk4 = bk3 + triple%
- bk5 = bk4 + home run%
-
- We now call our random number function which returns a (pseudo) random
- number between 0 and 1 -- and compare it with our break points. If "x"
- is our random number:
-
- If x is greater than 0 but less than bk1 --> walk
- If x is greater than bk1 but less than bk2 --> single
- If x is greater than bk2 but less than bk3 --> double
- If x is greater than bk3 but less than bk4 --> triple
- If x is greater than bk4 but less than bk5 --> home run
- If x is greater than bk5 but less than 1 --> out OR on by error
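-
- The stacking and lookup can be sketched as follows, using the combined
- percentages from the earlier example. The function name and the seed
- are arbitrary; this shows the technique, not SBS's actual routine.
-
```python
import random
from itertools import accumulate

EVENTS = ("walk", "single", "double", "triple", "home run")


def pick_event(probs, x):
    """Map a random number x in [0, 1) to an event using stacked break points."""
    # Running totals of the probabilities give bk1 .. bk5.
    breaks = accumulate(probs[e] for e in EVENTS)
    for event, bk in zip(EVENTS, breaks):
        if x < bk:
            return event
    return "out"  # or on by error


probs = {"walk": .0536, "single": .2329, "double": .0421,
         "triple": .0083, "home run": .0204}
rng = random.Random(1995)  # seeded only so that runs are repeatable
print([pick_event(probs, rng.random()) for _ in range(9)])
```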
-
- Handling the "outs" is a matter of deciding to which defensive player
- the ball was hit, whether or not it was a ground ball or a fly ball, and
- whether or not the defensive man handled it cleanly.
-
- Whether or not a player "pulls" the ball is influenced by the velocity
- of the pitcher. That is, his ability to "get around" on the pitcher. SBS
- uses a function based on a pitcher's strike out percentage to simulate
- his velocity and thus a hitter's ability to pull the ball. So a pitcher
- with big strike out totals is more difficult to pull than the pitcher
- who doesn't strike out many. OK, this is crude but it's the best we've
- got!
-
-
- SPEED/STOLEN BASES:
-
- The .dat files contain a column for stolen bases which SBS translates
- (via a formula) into a general purpose speed rating (1-9). You can view
- this speed rating by popping up your lineup or the opponent's lineup.
- The numbers 1 through 9 are roughly proportional to a player's chances
- of stealing 2nd or 3rd base. That is, a player with a speed rating of 9
- would have approximately a 90% chance of stealing 2nd or 3rd. A rating
- of 5 would be a 50% chance, etc. If the opponent calls a pitch-out the
- chance of success is reduced by 25% (but the chance of walking the
- batter increases by 25%). The chance of success is also reduced if a
- runner on 1st is trying to steal against a left-handed pitcher or a
- runner on 2nd is trying to steal against a right-handed pitcher.
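-
- As a rough sketch, the steal chance could be computed like this. The
- "rating times 10%" rule comes from the description above. Reading
- "reduced by 25%" as a relative cut is an assumption, and the size of
- the pitcher-handedness penalty is not stated anywhere, so the 10% used
- here is purely a placeholder.
-
```python
def steal_chance(speed_rating, pitch_out=False, hand_penalty=False,
                 hand_penalty_size=0.10):
    """Approximate probability of a successful steal of 2nd or 3rd."""
    chance = speed_rating / 10.0  # rating 9 -> about 90%, rating 5 -> about 50%
    if pitch_out:
        chance *= 0.75  # assumption: a relative 25% reduction
    if hand_penalty:
        # Hypothetical size; applies to a runner on 1st vs a left-hander
        # or a runner on 2nd vs a right-hander.
        chance *= 1.0 - hand_penalty_size
    return chance


print(steal_chance(9))
print(steal_chance(8, pitch_out=True))
```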
-
- The formula to calculate the speed rating from the .dat file information
- estimates the number of times the player made it to first base (ABs,
- singles & BBs) and then multiplies that by a fraction to estimate the
- number of steal opportunities the player might have had. Then it compares
- that estimate to the number of stolen bases and assigns a speed rating.
- The method is somewhat crude in that it penalizes fast players that
- happen to play on teams that just do not attempt to steal. On the other
- hand it helps players who are efficient at stealing bases even if they
- are not actually all that fast.
-
- The speed rating has other uses besides the stolen base. It helps
- determine the likelihood of the player hitting into a double play or
- successfully executing a drag bunt. It also helps determine if a runner
- goes from 1st to 3rd on a single or scores from 2nd on a single. Or if a
- runner on 1st can score on a double. (The number of outs and where the
- ball is hit are big factors here also. For example, runners tend to play
- it safe with no outs but usually will try to take the extra base with
- two out.)
-
-
- INFIELD "IN":
-
- This defensive strategy will cut off almost all runners who would
- otherwise score on a ground ball to an infielder [especially to the
- shortstop or second baseman]. The penalty is that the batter's base-hit
- percentages are adjusted upward dramatically because more ground balls
- will go through the infield. A .250 hitter becomes a .350 hitter. [When
- the player or the automatic manager chooses to play the infield in, it
- is only in effect for the current batter].
-
-
- AUTOMATIC MANAGER:
-
- SBS has not yet been programmed for brilliant management, but it does
- make basic offensive and defensive maneuvers.
-
- On defense SBS can:
- 1) pull the infield in
- 2) call for a pitch-out
- 3) call for an intentional walk
- 4) evaluate pitcher and go to the bullpen if necessary
-
- On offense SBS can:
- 1) pinch hit if a suitable pinch hitter is available
- 2) sacrifice bunt or squeeze bunt
- 3) steal
-
- The automatic manager does no pinch running or double-switch stuff. No
- "chess games" vs the other side. You will find that on offense the
- automatic manager is fairly aggressive. He is not shy about trying to
- steal bases or squeeze home a run.
-
-
- HAVE FUN:
-
- I hope this gives you a general idea of how the program operates. If you
- have questions, feel free to write.
-